Disguised voice detection method based on inverted Mel-frequency cepstral coefficient

LIN Xiaodan, QIU Yingqiang

Journal of Computer Applications 2019, 39 (12): 3510-3514. DOI: 10.11772/j.issn.1001-9081.2019050870

Voice disguise through pitch shifting is commonly used to conceal a speaker's identity, and widely available voice changers make it easy to apply. To determine both whether a speech signal has been pitch-shifted and how it was modified (pitch-raised or pitch-lowered), the traces that electronic disguise leaves in the signal spectrum, especially in the high-frequency region, were analyzed, and an electronic disguised voice detection method based on statistical moment features derived from the Inverted Mel-Frequency Cepstral Coefficient (IMFCC) was proposed. Firstly, the IMFCC and its first-order difference were extracted from each voice frame. Then, their statistical means were calculated. Finally, a Support Vector Machine (SVM) multi-class classifier was built on these statistical features to distinguish the original voice, the pitch-raised voice and the pitch-lowered voice. Experimental results on the TIMIT and NIST voice datasets show that the proposed method performs satisfactorily in detecting original, pitch-raised and pitch-lowered voice signals. Compared with the baseline system that uses MFCC features, the proposed features significantly increase the recognition rate of the disguise operation, and the method outperforms a Convolutional Neural Network (CNN) based framework when limited training data is available. Extensive experiments demonstrate that the proposed method generalizes well across different datasets and different disguising methods.
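
The feature pipeline outlined in the abstract (per-frame IMFCC, first-order difference, then the mean of each over the utterance) might be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract gives no parameters, so the sampling rate, frame length, hop size, FFT size, filter count and number of coefficients are all assumptions, as are the function names. The inverted filterbank is obtained here by flipping a standard triangular mel filterbank along both axes, which is one common way to define IMFCC, and the first-order difference is taken as a plain frame-to-frame difference.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Standard triangular mel filterbank, shape (n_filters, n_fft // 2 + 1)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):      # rising edge of the triangle
            fb[i, k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge of the triangle
            fb[i, k] = (hi - k) / (hi - mid)
    return fb

def imfcc(signal, sr=16000, frame_len=400, hop=160,
          n_fft=512, n_filters=26, n_ceps=13):
    """Per-frame IMFCC, shape (n_frames, n_ceps). All defaults are assumed values."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Flip filter order and frequency axis: narrow filters now sit at high frequencies.
    inv_fb = mel_filterbank(n_filters, n_fft, sr)[::-1, ::-1]
    log_energy = np.log(power @ inv_fb.T + 1e-10)
    # DCT-II basis written out explicitly to avoid extra dependencies.
    m = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * m + 1) / (2 * n_filters))
    return log_energy @ basis.T

def utterance_features(signal, sr=16000):
    """Mean IMFCC and mean first-order difference, concatenated into one vector."""
    c = imfcc(signal, sr)
    delta = np.diff(c, axis=0)        # simple frame-to-frame difference (assumption)
    return np.concatenate([c.mean(axis=0), delta.mean(axis=0)])
```

One such vector per utterance could then be passed to a multi-class SVM, for example scikit-learn's `SVC`, with the three labels original, pitch-raised and pitch-lowered. The kernel and hyperparameters are not stated in the abstract and would have to be chosen separately.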
